SIMD 0054: Sysvar for active stake #56
Looks good to me.
@ripatel-jump - do you guys want to comment on this new sysvar idea?
@0xNineteen - are you also interested in attempting an implementation for this?
yeah im down to attempt an impl
as proposed, this is going to be brittle around epoch boundaries. consider a transaction that was constructed from data queried prior to an epoch boundary, but broadcast and executed immediately afterwards. we probably need either the current and previous epoch in this sysvar, or use a pda to address (with epoch as a seed) the "sysvar" and just accumulate them in accountsdb indefinitely. latter may actually be better as something like an onchain vote can be ended in one epoch, then referenced to approve some future actions
### Ordering

The sysvar structure would be sorted by `vote_account` in ascending order
might pay to add an additional `Vec<u16>` of indexes into the vote-account-address-sorted vector, which is instead sorted by stake to ease "first N% stake" lookups
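A minimal sketch of the suggested stake-sorted index vector, assuming the sysvar entries are `(vote_account, active_stake)` tuples already sorted by vote account. The `Pubkey` alias and the function names `stake_sorted_indexes` and `top_percent` are hypothetical and not part of the proposal; a `u16` index also assumes the table never holds more than 65,535 entries.

```rust
/// Stand-in for a 32-byte Solana public key (assumption for this sketch).
type Pubkey = [u8; 32];

/// Builds indexes into `entries` (which stays sorted by vote account),
/// ordered by stake descending. Ties keep the lower index first.
/// Assumes `entries.len()` fits in a u16.
fn stake_sorted_indexes(entries: &[(Pubkey, u64)]) -> Vec<u16> {
    let mut idx: Vec<u16> = (0..entries.len() as u16).collect();
    idx.sort_by(|&a, &b| {
        entries[b as usize]
            .1
            .cmp(&entries[a as usize].1)
            .then(a.cmp(&b))
    });
    idx
}

/// Returns the shortest prefix of `by_stake` whose combined stake reaches
/// `percent` percent of the total stake in `entries`.
fn top_percent<'a>(entries: &[(Pubkey, u64)], by_stake: &'a [u16], percent: u64) -> &'a [u16] {
    let total: u128 = entries.iter().map(|e| e.1 as u128).sum();
    let target = total * percent as u128; // compared against acc * 100
    let mut acc: u128 = 0;
    for (n, &i) in by_stake.iter().enumerate() {
        acc += entries[i as usize].1 as u128;
        if acc * 100 >= target {
            return &by_stake[..=n];
        }
    }
    by_stake
}
```

With this layout a "first N% stake" lookup walks only the highest-staked entries instead of sorting the whole table inside the program.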
## Detailed Design

- sysvar structure: `Vec<(vote_account: Pubkey, active_stake_in_lamports: u64)>`
i would suggest an additional field `epoch_stake: u64` -- total active stake for the epoch
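As a rough illustration of how the proposed structure and the suggested `epoch_stake` field could fit together, here is a sketch under stated assumptions. The struct name `ActiveStakeSysvar`, the `Pubkey` alias, and the `share_bps` helper are hypothetical; only the tuple layout, the ascending `vote_account` ordering, and the `epoch_stake: u64` field come from the thread.

```rust
/// Stand-in for a 32-byte Solana public key (assumption for this sketch).
type Pubkey = [u8; 32];

/// Hypothetical in-memory view of the proposed sysvar.
struct ActiveStakeSysvar {
    /// Total active stake for the epoch, in lamports.
    epoch_stake: u64,
    /// (vote_account, active_stake_in_lamports), sorted by vote_account ascending.
    entries: Vec<(Pubkey, u64)>,
}

impl ActiveStakeSysvar {
    /// Sorts the entries by vote account and precomputes the epoch total.
    fn new(mut entries: Vec<(Pubkey, u64)>) -> Self {
        entries.sort_by(|a, b| a.0.cmp(&b.0));
        let epoch_stake: u64 = entries.iter().map(|e| e.1).sum();
        Self { epoch_stake, entries }
    }

    /// Returns a vote account's share of the epoch stake in basis points,
    /// or None if the account is absent or the total stake is zero.
    /// Uses a binary search, which relies on the ascending ordering.
    fn share_bps(&self, vote_account: &Pubkey) -> Option<u64> {
        let i = self
            .entries
            .binary_search_by(|e| e.0.cmp(vote_account))
            .ok()?;
        if self.epoch_stake == 0 {
            return None;
        }
        Some((self.entries[i].1 as u128 * 10_000 / self.epoch_stake as u128) as u64)
    }
}
```

Storing the total up front means a program that only needs one validator's share does not have to sum every entry.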
Implementing the proposed sysvar will enable new types of programs which are
not possible now,
improving Solana's ecosystem.
not a particularly valuable statement. elaborate with examples
modified on a per-epoch basis, validators will only need to update this
account on epoch boundaries.

We would also need a new feature gate to activate this sysvar.
move to security section. broken consensus -> loss of availability -> security
it needs to
pass in all of the stake accounts which have delegated to it. This is
nit: there are a few truncated line breaks like this throughout... would be good to clean them up
The motivation section of this proposal misses some important use cases.
A more complete list of use cases for the active stake table is as follows:
1. Querying the stake of individual validators from an on-chain program (this is mentioned by the proposal).
2. Exporting epoch stake weights to an RPC client
3. Proving epoch stake weights to an off-chain user (light clients)
4. Deriving the leader schedule using this information
A sysvar-based mechanism is not an efficient or robust way to enable either of these 3.
Problem 1: Length restriction
The proposal limits the number of records stored in the new sysvar to 4096.
It is currently trivial to allocate more than 4096 vote accounts, and in the future this is within the grasp of a well-funded attacker.
This means that this mechanism might start to return incorrect data in the future, by omitting any vote accounts past the first 4096. This is mildly annoying in an on-chain context (1), or even dangerous in consensus-critical applications like bridges (such as 3).
Problem 2: Iterating all vote accounts on-chain is impractical
The computational capability of on-chain programs is quite limited. As the Solana network grows, and the number of validators increases into the thousands, it becomes impractical to iterate through all vote accounts in a single transaction execution.
It doesn't seem like there is any practical reason for doing so anyways. Therefore, there is also no practical reason for why we should have to load a list of all vote accounts into virtual machine memory.
Problem 3: Memory copies
In order to make this account accessible from the runtime, the sysvar will have to be copied into the virtual machine input segment, which is currently a bottleneck for transaction processing throughput. The Solana Labs virtual machine implementation is working on various mechanisms to allow host memory to be mapped directly into virtual machine memory. However, these mechanisms still require expensive operating system syscalls (mmap()), or exotic use of hardware hypervisor functionality, and put pressure on the TLB.
Instead of relying on future VM optimization, it is better to just map the subset of vote account records into VM memory that is needed (refer to problem 2).
Problem 4: Query complexity
Storing a flat array does not allow efficient queries of stake by pubkey. Finding an entry requires a linear walk. As established in problem 2, it is not safe to assume that such linear walks are practical.
Therefore, the only way to query a specific pubkey is to locate the index of such a record off-chain, and then provide this index via transaction instruction data.
Considering instruction data is unauthenticated, this requires a sanity check whether the requested index actually holds the vote account address that the program is looking for.
This is more complex off-chain and on-chain than alternative solutions.
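The index-plus-sanity-check pattern described under Problem 4 might look roughly like the following sketch. It assumes the flat `(vote_account, stake)` array from the proposal; the `Pubkey` alias, the `LookupError` type, and the function name `stake_at_hint` are hypothetical.

```rust
/// Stand-in for a 32-byte Solana public key (assumption for this sketch).
type Pubkey = [u8; 32];

#[derive(Debug, PartialEq)]
enum LookupError {
    /// The index supplied in instruction data is past the end of the table.
    IndexOutOfBounds,
    /// The entry at the supplied index belongs to a different vote account.
    PubkeyMismatch,
}

/// Reads the stake at an off-chain-supplied index `hint`, but only trusts it
/// after checking that the entry really holds the `expected` vote account.
/// Instruction data is unauthenticated, so the check is mandatory.
fn stake_at_hint(
    entries: &[(Pubkey, u64)],
    hint: usize,
    expected: &Pubkey,
) -> Result<u64, LookupError> {
    let (key, stake) = entries.get(hint).ok_or(LookupError::IndexOutOfBounds)?;
    if key != expected {
        return Err(LookupError::PubkeyMismatch);
    }
    Ok(*stake)
}
```

The on-chain cost is one bounds check and one 32-byte comparison, but the client must first locate the index off-chain, which is the extra complexity the comment points out.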
Problem 5: Deriving the leader schedule is impractical
Deriving the leader schedule requires a mapping of (node identity) => (active stake). However, this sysvar only mentions vote account addresses, not node identities. Resolving those would require lookups in the respective vote accounts.
For RPC clients, it would be impractical to fetch the node identity of every vote account.
The same is true for light clients, especially when running in resource-constrained environments like ZKP VMs.
Problem 6: Unnecessary complexity in epoch boundary
The Solana runtime would have to implicitly update the new sysvar(s) as part of the epoch boundary. The epoch boundary already adds EpochStakes as implicit state to the bank, so doing it twice seems like unnecessary complexity. Why not offer an interface that provides access to the bank's existing EpochStakes data structure?
Have you considered adding RPC calls and syscalls that permit querying specific EpochStakes instead? Improving light client accessibility of implicit runtime state is best done via deterministic content-addressable data structures.
We also need to consider a maximum data size for the sysvar.
Currently, there are 3422 vote accounts on mainnet (1818 active and 1604 delinquent),
so we can use a maximum limit of 4096 entries and still include
all the vote accounts for now.
Using 4096 as the max number of entries the size would be (8 + 40 * 4096) =
163,848 bytes. Once the number of entries exceeds the max allowed,
vote accounts with the least amount of stake will be removed from the sysvar.
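The size arithmetic and the truncation rule quoted above can be sketched as follows. This is an illustration only: it assumes the 8 bytes are a length prefix and each entry is a 32-byte pubkey plus a `u64` stake (consistent with the `8 + 40 * 4096` formula), and the tie-breaking rule, the `Pubkey` alias, and the function names are hypothetical.

```rust
/// Stand-in for a 32-byte Solana public key (assumption for this sketch).
type Pubkey = [u8; 32];

/// Assumed layout: 8-byte length prefix, then 40 bytes per entry.
const LEN_PREFIX_BYTES: usize = 8;
const ENTRY_BYTES: usize = 32 + 8; // pubkey + u64 stake

/// Serialized size of a table holding `n` entries.
fn sysvar_size_bytes(n: usize) -> usize {
    LEN_PREFIX_BYTES + ENTRY_BYTES * n
}

/// Keeps the `max_entries` highest-staked accounts, then restores the
/// ascending vote-account ordering. Ties on stake are broken by pubkey
/// (an assumption; the proposal does not specify a tie-break).
fn truncate_to_max(mut entries: Vec<(Pubkey, u64)>, max_entries: usize) -> Vec<(Pubkey, u64)> {
    entries.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    entries.truncate(max_entries);
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}
```

For 4096 entries this gives 163,848 bytes, matching the figure in the proposal, and 10,000 entries give 400,008 bytes, in line with the "~400 kB" estimate later in the thread.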
im very suspicious of this coming back to bite us in the future. there arent 4096 validators today, but there could be someday, and it puts us in the unfortunate position of needing to decide whether to keep increasing the size of the sysvar, or make smaller validators second-class citizens. i especially dont like that this creates a disincentive to decentralization. this isnt a theoretical concern because (at least according to our marketing materials...) we already have over 2000 active voting nodes
I agree, I think the size of the sysvar should be much higher (or have no limit, if that doesn't introduce other problems) so that it can store all the vote accounts, for the following reasons:
- If we truncate the sysvar to the top stakers, in the case that one of the voters becomes delinquent for a certain slot, we won't be able to verify consensus since the sysvar would update every epoch.
- Account size isn't a concern here because before we reach that limit there would be other bottlenecks in the system as a consequence of having so many nodes in the network.
im a fan of the idea in spirit but i wonder about the best way to implement it... because the stakes cache already has this information computed in a form thats easy to look up, i wonder if it could be made queryable? an rpc call would (presumably, i havent worked in this part of the code much) be straightforward, but maybe theres a way to have a pseudo-program to retrieve the stake amount for a specific vote account in the manner of
Another important use case that a large section of the validator community is interested in is the ability to enable stake-weighted on-chain voting for a DAO for governance. Mentioning it for completeness' sake
we would also like this information available on-chain for something
Yep this SIMD is about making that information available on chain. Any RPC endpoint discussion can be secondary, or moved elsewhere
### Changes Required

Stake weight information should already be available on full node clients
Does this sysvar contain stakes before or after reward calculation?
This SIMD seems to have gone somewhat stale, what do we need to push this through? This is important to being able to conduct on-chain stake weighted governance voting by validators
@t-nelson any thoughts on this? Since we spoke at the core dev mixer I think the concerns about space were addressed.
@michaelh-laine The problem with this proposal is that it's quite inefficient (it adds a large account). It also loses correctness if the number of vote accounts increases in the future. I suggest using account compression here (Solana's silly name for "hash tree"), which is a common solution for making unbounded data accessible on-chain.
@ripatel-fd even with 10k vote accounts the size should be ~400 kB.
@riptl Given account compression being deprioritized short term, and per @anoushk1234, "with 10k vote accounts the size should be ~400 kB", does it make sense to move this SIMD forward to enable on-chain stake weighted governance voting as @michaelh-laine mentioned above? @0xNineteen still interested in attempting an implementation?
Maybe consider #133 as an alternative?
What’s the difference?
@michaelh-laine Same query is available to BPF programs via syscall, but the data is not on-chain via sysvar account.
to comment my thoughts: this simd was initially thought to be simple but there was a lot of feedback that the sysvar approach wasnt the right way to go - i havent been too involved in the discussions lately but it seems like @buffalojoec's SIMD #133 will be the better way forward
Thanks @0xNineteen for the update. @buffalojoec SIMD-0133 could supersede this for the on-chain stake weights use case, unless hearing otherwise from the community.
@0xSol I don't think this SIMD needs to be superseded as it never was accepted. We should be good to just close this SIMD.
Close per above comments.
We propose to add a new sysvar that contains vote account pubkeys and their corresponding total active stake. This will enable on-chain programs to verify a validator's total active stake.